Abstract. This paper addresses the online exact string matching problem,
which consists in finding all occurrences of a given pattern p in a text
t. It is an extensively studied problem in computer science, mainly due to
its direct applications to such diverse areas as text, image and signal
processing, speech analysis and recognition, data compression, information
retrieval, computational biology and chemistry. Since 1970 more than 80
string matching algorithms have been proposed, more than 50% of them in
the last ten years. In this note we present a comprehensive list of all
string matching algorithms and present experimental results in order to
compare them from a practical point of view. From our experimental
evaluation it turns out that the performance of the algorithms is quite
different for different alphabet sizes and pattern lengths.

1 Introduction

Given a text t of length n and a pattern p of length m over some alphabet Σ
of size σ, the string matching problem consists in finding all occurrences of the
pattern p in the text t. It is an extensively studied problem in computer science,
mainly due to its direct applications to such diverse areas as text, image and
signal processing, speech analysis and recognition, data compression, information
retrieval, computational biology and chemistry.

String matching algorithms are also basic components used in implementations
of practical software existing under most operating systems. Moreover,
they emphasize programming methods that serve as paradigms in other fields of
computer science. Finally, they also play an important role in theoretical
computer science by providing challenging problems.

Applications require two kinds of solutions depending on which string, the
pattern or the text, is given first. Algorithms based on the use of automata or
combinatorial properties of strings are commonly implemented to preprocess the
pattern and solve the first kind of problem. This kind of problem is generally
referred to as online string matching. The notion of indexes realized by trees or
automata is used instead in the second kind of problem, generally referred to as
offline string matching. In this paper we are only interested in algorithms of the
first kind.

The worst case lower bound of the online string matching problem is O(n) and
was first reached by the well known Morris-Pratt algorithm [MP70]. An
average lower bound in O(n log m / m) (with equiprobability and independence
of letters) has been proved by Yao in [Yao79].
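
To make the linear worst case concrete, the following is a minimal sketch of a
Morris-Pratt style matcher in C. It is an illustration written for this note,
not the implementation used in the experiments; the function names
mp_preprocess and mp_count are hypothetical. The text pointer never moves
backwards, and the total number of failure-table jumps is bounded by the number
of text characters read, which is where the O(n) search bound comes from.

```c
#include <assert.h>
#include <stdlib.h>

/* Failure table: mp_next[i] is the length of the longest proper border of
   p[0..i-1]; mp_next[0] = -1 by convention. The table has m + 1 entries. */
void mp_preprocess(const char *p, int m, int *mp_next) {
    int i = 0, j = -1;
    mp_next[0] = -1;
    while (i < m) {
        while (j >= 0 && p[i] != p[j])
            j = mp_next[j];
        i++;
        j++;
        mp_next[i] = j;
    }
}

/* Count all (possibly overlapping) occurrences of p[0..m-1] in t[0..n-1].
   Returns -1 if the failure table cannot be allocated. */
int mp_count(const char *p, int m, const char *t, int n) {
    assert(m > 0);
    int *mp_next = malloc((size_t)(m + 1) * sizeof *mp_next);
    if (mp_next == NULL)
        return -1;
    mp_preprocess(p, m, mp_next);

    int count = 0;
    int i = 0; /* number of pattern characters currently matched */
    for (int j = 0; j < n; j++) {
        while (i >= 0 && p[i] != t[j])
            i = mp_next[i]; /* fall back to the next shorter border */
        i++;
        if (i == m) { /* full match ending at text position j */
            count++;
            i = mp_next[i];
        }
    }
    free(mp_next);
    return count;
}
```

For instance, mp_count("aba", 3, "ababa", 5) reports the two overlapping
occurrences starting at positions 0 and 2.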

More than 80 online string matching algorithms (hereafter simply string
matching algorithms) have been proposed over the years. All solutions can be
divided into two classes: algorithms which solve the problem by making use only
of comparisons between characters, and algorithms which make use of automata
in order to locate all occurrences of the searched string. The latter class can
be further divided into two classes: algorithms which make use of deterministic
automata and algorithms based on bit-parallelism which simulate the behavior
of non-deterministic automata.

Fig. 1. The list of all comparison based string matching algorithms (1970-2010).

Fig. 2. The list of the automata based string matching algorithms (1992-2009).

Fig. 3. The list of all bit-parallel string matching algorithms (1992-2010).

Fig. 1, Fig. 2 and Fig. 3 present the list of all string matching algorithms 
based on comparison of characters, deterministic automata and bit-parallelism, 
respectively. 

The class of algorithms based on comparison of characters is the widest class
and comprises almost 50 per cent of all solutions. Among the comparison based
string matching algorithms the Boyer-Moore algorithm [BM77] deserves a special
mention, since it has been particularly successful and has inspired much work.

Automata also play a very important role in the design of efficient string
matching algorithms. The first linear algorithm based on deterministic automata
is the Automaton Matcher [CLRS01].

Over the years automata based solutions have also been developed to design
algorithms which have optimal sublinear performance on average. This is
done by using factor automata [BBE+83,Cro85,BBE+85,ACR99], data structures
which identify all factors of a word. Among the algorithms which make
use of a factor automaton, the BDM [CR94] and the Backward-Oracle-Matching
algorithm [ACR99] are among the most efficient solutions, especially for long
patterns.

In recent years, most of the work has been devoted to developing software
techniques to simulate efficiently the parallel computation of non-deterministic
finite automata related to the search pattern. Such simulations can be done
efficiently using the bit-parallelism technique [BYG92], which consists in
exploiting the intrinsic parallelism of the bit operations inside a computer
word. In some cases, bit-parallelism allows the overall number of operations to
be reduced by up to a factor equal to the number of bits in a computer word.
Thus, although string matching algorithms based on bit-parallelism are usually
simple and have very low memory requirements, they generally work well with
patterns of moderate length only.

Fig. 4. Number of algorithms proposed in the last 21 years (1990-2010)

The bit-parallelism technique has been used to simulate efficiently the
non-deterministic version of the Morris-Pratt automaton. The resulting algorithm,
named Shift-Or [BYG92], runs in O(n⌈m/w⌉), where w is the number of bits in a
computer word. Later, a variant of the Shift-Or algorithm, called Shift-And, and
a very fast BDM-like algorithm (BNDM), based on the bit-parallel simulation
of the non-deterministic suffix automaton, were presented in [NR98b].
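
The bit-parallel simulation can be illustrated with a short sketch of the
Shift-And variant in C. This is only an illustration under stated assumptions,
not the code used in the experiments: the function name shift_and_count is
hypothetical, the machine word is taken to be a 64-bit uint64_t, and the pattern
is required to fit in that single word (m ≤ 64). Bit i of the state word is set
exactly when the prefix p[0..i] matches the text ending at the current
position, so one shift, one OR and one AND advance all automaton states at once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Count occurrences of p[0..m-1] in t[0..n-1] with the Shift-And technique.
   Assumption of this sketch: the pattern fits in one 64-bit word. */
int shift_and_count(const unsigned char *p, int m,
                    const unsigned char *t, int n) {
    assert(m > 0 && m <= 64);

    /* mask[c] has bit i set iff p[i] == c */
    uint64_t mask[256];
    memset(mask, 0, sizeof mask);
    for (int i = 0; i < m; i++)
        mask[p[i]] |= (uint64_t)1 << i;

    const uint64_t accept = (uint64_t)1 << (m - 1); /* bit of the final state */
    uint64_t state = 0;
    int count = 0;
    for (int j = 0; j < n; j++) {
        /* extend every active prefix by one character and start a new one */
        state = ((state << 1) | 1) & mask[t[j]];
        if (state & accept)
            count++; /* an occurrence ends at text position j */
    }
    return count;
}
```

Shift-Or uses the complemented representation (a 0 bit marks an active state),
which removes the OR with 1 from the inner loop; the sketch uses Shift-And
because its bit semantics are easier to read.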

Bit-parallelism encoding requires one bit per pattern symbol, for a total of
⌈m/w⌉ computer words. Thus, as long as a pattern fits in a computer word,
bit-parallel algorithms are extremely fast, otherwise their performance degrades
considerably as ⌈m/w⌉ grows. Though there are a few techniques to maintain
good performance in the case of long patterns, such limitation is intrinsic.
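
As a small worked example of the ⌈m/w⌉ count, the helper below computes the
number of words a bit-parallel encoding needs. The function name words_needed is
hypothetical, and the word sizes used in the example (32 and 64 bits) are
illustrative assumptions, since the text does not fix w.

```c
#include <assert.h>

/* Number of w-bit computer words needed to store one bit per pattern
   symbol, i.e. ceil(m / w), computed with integer arithmetic only. */
int words_needed(int m, int w) {
    assert(m > 0 && w > 0);
    return (m + w - 1) / w;
}
```

With w = 32, a pattern of length 32 still fits in one word, a pattern of length
33 already needs two, and the longest pattern length tested here, m = 1024,
needs 32 words (16 words with w = 64).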

Fig. 4 presents a plot of the number of algorithms (for each class) proposed
in the last 21 years (1990-2010). Observe that the number of proposed solutions
has doubled in the last ten years, demonstrating the increasing interest in this
issue. It is also interesting to observe that almost 50 per cent of the solutions
in the last ten years are based on bit-parallelism. Moreover, it seems that the
number of bit-parallel solutions proposed over the years follows an increasing
trend.

In the rest of the paper we present a comprehensive experimental evaluation 
of all string matching algorithms listed above in order to compare them from a 
practical point of view. 
2 Experimental Results

We next present experimental data which allow us to compare, in terms of running
time, all the algorithms listed in Fig. 1, Fig. 2 and Fig. 3.

In particular we tested the Hashq algorithm with q equal to 3, 5 and 8. The
AOSO and BNDMq algorithms have been tested with a value of q equal to 2, 4
and 6. Finally the SBNDMq and UFNDMq algorithms have been tested with q equal
to 2, 4, 6 and 8.

All algorithms have been implemented in the C programming language and
were used to search for the same strings in large fixed text buffers on a PC
with an Intel Core2 processor of 1.66GHz. Running times have been measured
with a hardware cycle counter, available on modern CPUs. The codes have been
compiled with the GNU C Compiler, using the optimization options -O2
-fno-guess-branch-probability.

In particular, the algorithms have been tested on the following 12 text buffers:

(i) eight Randσ text buffers, for σ = 2, 4, 8, 16, 32, 64, 128 and 256, where each
Randσ text buffer consists of a 5Mb random text over a common alphabet
of size σ, with a uniform distribution of characters;

(ii) a genome sequence of 4,638,690 base pairs of Escherichia coli (with σ = 4);

(iii) a protein sequence (the hs file) from the Saccharomyces cerevisiae genome,
of length 3,295,751 bytes (with σ = 20);

(iv) the English King James version of the Bible composed of 4,047,392
characters (with σ = 63);

(v) the file world192.txt (The CIA World Fact Book) composed of 2,473,400
characters (with σ = 94).

Files (ii), (iv) and (v) are from the Large Canterbury Corpus
(http://www.data-compression.info/Corpora/CanterburyCorpus/), while file (iii)
is from the Protein Corpus (http://data-compression.info/Corpora/ProteinCorpus/).
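
A Randσ buffer of the kind described in item (i) could be produced along the
following lines. This is a sketch under assumptions, not the generator used for
the experiments: the function name fill_random_text is hypothetical, the
standard rand() generator is used only as a stand-in (the text does not say
which generator was used), and symbols are represented as the byte values
0, ..., σ-1. Rejection sampling is used so that every symbol is equally likely.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill buf[0..n-1] with symbols drawn uniformly from {0, ..., sigma-1}.
   rand() is a stand-in generator chosen for portability of the sketch. */
void fill_random_text(unsigned char *buf, size_t n, int sigma,
                      unsigned int seed) {
    assert(sigma >= 1 && sigma <= 256);
    srand(seed);

    /* Accept only draws below the largest multiple of sigma that fits in
       the range of rand(); this avoids the bias of a plain rand() % sigma. */
    const unsigned long range = (unsigned long)RAND_MAX + 1UL;
    const unsigned long limit = range - range % (unsigned long)sigma;

    for (size_t i = 0; i < n; i++) {
        unsigned long r;
        do {
            r = (unsigned long)rand();
        } while (r >= limit);
        buf[i] = (unsigned char)(r % (unsigned long)sigma);
    }
}
```

Calling fill_random_text(buf, n, 4, seed) with a fixed seed gives a reproducible
buffer over a four-symbol alphabet, so that every algorithm can be run on
exactly the same text.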

For each input file, we have generated sets of 400 patterns of fixed length m
randomly extracted from the text, for m ranging over the values 2, 4, 8, 16, 32,
64, 128, 256, 512 and 1024. For each set of patterns we report in a table the
mean over the running times of the 400 runs. Running times are expressed in
thousandths of seconds.
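
The measurement procedure described above can be sketched as follows. The code
is an illustration only: mean_time_ms, matcher_fn and naive_count are
hypothetical names, the brute-force matcher is just a placeholder for the
algorithm under test, and the portable clock() function replaces the hardware
cycle counter used in the actual experiments, so absolute numbers would not be
comparable with the tables.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Common signature of the matchers under test: returns the number of
   occurrences of p[0..m-1] in t[0..n-1]. */
typedef int (*matcher_fn)(const unsigned char *p, int m,
                          const unsigned char *t, int n);

/* Brute-force matcher, used here only as a stand-in algorithm. */
int naive_count(const unsigned char *p, int m,
                const unsigned char *t, int n) {
    int count = 0;
    for (int j = 0; j + m <= n; j++)
        if (memcmp(t + j, p, (size_t)m) == 0)
            count++;
    return count;
}

/* Run `search` on `runs` patterns of length m, each copied from a random
   position of the text, and return the mean running time in milliseconds
   (or -1.0 if the pattern buffer cannot be allocated). The total number of
   reported occurrences is stored in *total_occ as a sanity check. */
double mean_time_ms(matcher_fn search, const unsigned char *t, int n,
                    int m, int runs, unsigned int seed, long *total_occ) {
    assert(m > 0 && m <= n && runs > 0);
    unsigned char *p = malloc((size_t)m);
    if (p == NULL)
        return -1.0;

    srand(seed);
    double total_ms = 0.0;
    *total_occ = 0;
    for (int r = 0; r < runs; r++) {
        /* rand() % k is slightly biased and bounded by RAND_MAX;
           acceptable for a sketch, not for a careful benchmark */
        int start = rand() % (n - m + 1);
        memcpy(p, t + start, (size_t)m);

        clock_t begin = clock();
        int occ = search(p, m, t, n);
        clock_t end = clock();

        total_ms += 1000.0 * (double)(end - begin) / CLOCKS_PER_SEC;
        *total_occ += occ;
    }
    free(p);
    return total_ms / runs;
}
```

Because every pattern is extracted from the text itself, each run must report
at least one occurrence; checking that total_occ is at least the number of runs
is a cheap way to catch a broken matcher before trusting its timings.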

Moreover we color each running time value with different shades of blue-red.
In particular, better results are presented in tones verging on red while worse
results are presented in tones verging on blue. In addition, best results are
highlighted with a light gray background.

Although we tested more than 85 different algorithms, for the sake of clarity
we include in the following tables only the algorithms that obtain, for each
text buffer and each pattern length, the 25 best results. We add a red marker
to comparison based algorithms, while a green and a blue marker are added to
automata and bit-parallel algorithms, respectively.

Then, for each table, we briefly discuss the performance of the string matching
algorithms by referring to the following four classes of patterns:

— very short patterns (patterns with m ≤ 4);

— short patterns (patterns with 4 < m ≤ 32);

— long patterns (patterns with 32 < m ≤ 256);

— very long patterns (patterns with m > 256).

Finally we discuss the overall performance of the tested algorithms by considering 
those algorithms which maintain good performance for all classes of patterns. 
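
The four classes can be written down as a small helper, shown here as a sketch.
The names pattern_class and classify_pattern are hypothetical. The class
boundaries are assumed to be inclusive on the upper side (m ≤ 4, m ≤ 32,
m ≤ 256), which matches how the discussion of the tables assigns the lengths 4,
32 and 256.

```c
#include <assert.h>

typedef enum {
    PAT_VERY_SHORT, /* m <= 4        */
    PAT_SHORT,      /* 4 < m <= 32   */
    PAT_LONG,       /* 32 < m <= 256 */
    PAT_VERY_LONG   /* m > 256       */
} pattern_class;

/* Map a pattern length to one of the four classes used in the discussion. */
pattern_class classify_pattern(int m) {
    assert(m > 0);
    if (m <= 4)
        return PAT_VERY_SHORT;
    if (m <= 32)
        return PAT_SHORT;
    if (m <= 256)
        return PAT_LONG;
    return PAT_VERY_LONG;
}
```

Under this reading, the ten tested lengths split into 2 and 4 (very short), 8,
16 and 32 (short), 64, 128 and 256 (long), and 512 and 1024 (very long).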
2.1 Experimental Results on Rand2 Problem

In this section we present experimental results on a random text buffer over a
binary alphabet. Matching binary data is an interesting problem in computer
science, since binary data are omnipresent in telecom and computer network
applications. Many formats for data exchange between nodes in distributed
computer systems, as well as most network protocols, use binary representations.
In the case of very short patterns the SO and SA algorithms obtain the best
results. The AUT algorithm also obtains good results. For short patterns the
algorithms based on bit-parallelism achieve good results. The AOSO2 algorithm
is the best for patterns of length 8, while the HASHq algorithms obtain the best
results for patterns of length 16 and 32. In the case of long patterns the best
results are obtained by the HASHq algorithms and by the SSEF algorithm (for
patterns of length 256). For very long patterns the best results are obtained by
the SSEF algorithm. Regarding the overall performance, no algorithm maintains
good performance for all patterns. However, when the pattern is short the SA
algorithm is a good choice, while the HASH5 and the SSEF algorithms are suggested
for patterns with a length greater than 16.
2.2 Experimental Results on Rand4 Problem

Matching data over a four-character alphabet is an interesting problem in
computer science, mostly related to computational biology. It is the case, for
instance, of DNA sequences, which are constructed over an alphabet of four bases.
In the case of very short patterns the SA and SO algorithms obtain the best
results. For short patterns the algorithms based on bit-parallelism achieve
better results, in particular BNDMq4 and SBNDMq4. Other algorithms like HASH5,
HASH8, EBOM and SEBOM are quite competitive. In particular the HASH3
algorithm obtains the best results for patterns of length 8. In the case of long
patterns the best results are obtained by the SSEF algorithm. However the
algorithms in the EBOM family are good choices. Among the algorithms based on
character comparisons the HASH5 and HASH8 algorithms achieve good results.
Among the algorithms based on bit-parallelism the SBNDMq4 algorithm maintains
quite competitive performance. For very long patterns the best results are
obtained by the SSEF, HASH8 and BOM algorithms. Finally the EBOM algorithm
maintains very good performance for all patterns.
2.3 Experimental Results on Rand8 Problem

In this section we present experimental results on a random text buffer over an
alphabet of eight characters.



m              2     4     8    16    32    64   128   256   512  1024
KR          22.4  17.8  16.4  16.4  16.4  16.4  16.4  16.4  16.4  16.4
ZT          36.0  18.3  9.86  5.84  3.79  2.96  3.26  3.02  2.94  2.94
QS          19.6  13.3  8.91  6.88  6.16  6.14  6.10  6.12  6.17  6.26
TunBM       22.5  13.1  8.59  6.65  6.28  6.21  6.12  6.19  6.19  6.12
NSN         20.5  22.5  22.1  22.1  22.1  22.1  22.1  22.1  22.2  22.1
Raita       20.4  11.3  7.38  5.65  5.33  5.25  5.23  5.29  5.18  5.17
RCol        18.8  11.1  7.23  5.47  4.97  4.67  4.50  4.44  4.01  3.81
BR          16.8  11.6  7.47  4.75  3.24  2.76  3.39  2.97  2.93  2.95
FS          18.9  11.1  7.25  5.54  4.95  4.62  4.45  4.35  4.00  3.79
FFS         18.7  11.2  7.23  5.24  4.34  3.61  3.61  3.36  2.97  2.85
BFS         18.9  11.1  7.08  5.15  4.30  3.61  3.64  3.44  3.06  3.02
TS          18.6  15.9  12.2  8.75  6.13  4.84  4.22  3.73  3.48  3.33
SSABS       16.5  11.6  8.34  6.74  6.33  6.32  6.28  6.29  6.26  6.17
TVSBS       14.9  10.5  6.91  4.50  3.23  2.82  3.17  2.96  2.95  2.92
FJS         17.9  12.8  9.36  7.61  7.19  7.13  6.99  7.09  7.14  7.15
HASH3          -  19.1  7.25  3.88  2.66  2.46  2.75  2.60  2.45  2.38
HASH5          -     -  12.2  4.79  2.71  2.41  2.78  2.06  1.64  1.47
HASH8          -     -     -  7.61  3.45  2.46  2.85  1.96  1.45  1.30
TSW         19.3  13.5  8.80  5.69  3.91  3.07  3.84  3.28  3.24  3.24
GRASPm      21.5  12.4  7.94  5.84  4.76  3.84  4.06  3.35  2.70  2.17
SSEF           -     -     -     -  5.39  3.36  3.43  1.79  1.00  0.55
AUT         22.3  22.3  21.7  22.3  22.4  21.7  21.8  21.9  22.6  23.9
RF          34.5  22.0  12.6  7.02  4.31  2.89  3.47  2.59  2.87  4.38
BOM         48.6  33.3  22.2  15.1  9.60  5.98  5.11  2.82  1.60  0.94
BOM2        36.8  23.1  13.2  7.16  4.37  2.82  3.40  1.96  1.36  2.41
ILDM1       30.3  20.1  11.6  6.38  4.01  2.94  3.54  3.29  4.67  8.20
ILDM2       31.9  19.4  10.8  5.93  3.77  2.83  3.53  3.29  4.65  8.21
EBOM        19.6  8.37  5.04  3.70  3.00  2.63  3.13  1.90  1.48  2.65
FBOM        17.4  10.4  6.72  4.63  3.45  2.83  3.30  2.01  1.52  2.69
SEBOM       20.6  8.73  5.22  3.82  3.12  2.76  3.25  2.02  1.56  2.72
SFBOM       17.2  10.4  6.77  4.68  3.49  2.88  3.33  2.05  1.56  2.72
SO          16.8  16.8  16.8  16.8  16.8  21.8  21.8  21.7  21.7  21.8
SA          16.4  16.4  16.4  16.4  16.4  18.9  18.9  18.9  18.9  18.9
BNDM        37.3  22.0  11.6  6.10  3.66  4.51  4.51  4.52  4.52  4.51
BNDM-L      37.1  21.9  11.6  6.08  3.67  5.48  6.86  6.33  5.98  6.26
SBNDM       48.2  17.8  8.61  5.05  3.24  3.09  3.08  3.11  3.10  3.09
TNDM        29.9  19.1  10.9  5.89  3.57  3.55  3.55  3.54  3.53  3.56
TNDMa       27.0  18.2  11.2  5.93  3.47  3.40  3.39  3.39  3.38  3.38
LBNDM       39.7  22.9  12.8  7.08  4.27  2.95  4.25  3.86  7.02  33.7
SBNDM2      36.0  13.5  6.98  4.30  3.01  2.79  2.79  2.81  2.80  2.80
SBNDM-BMH   21.9  14.0  8.49  5.03  3.21  3.08  3.11  3.08  3.09  3.09
BMH-SBNDM   19.4  11.0  6.96  5.10  4.08  4.22  4.25  4.21  4.19  4.18
AOSO2       58.0  15.9  9.79  9.72  9.71  8.56  8.58  8.56  8.57  8.56
AOSO4          -  49.7  11.1  5.15  5.05  4.56  4.55  4.57  4.57  4.56
AOSO6          -     -  44.8  9.79  3.53  3.31  3.31  3.31  3.30  3.30
FSBNDM      28.1  14.2  7.85  4.71  2.74  2.75  2.74  2.74  2.74  2.76
BNDMq2      33.8  12.8  6.58  4.06  2.84  3.41  3.44  3.45  3.45  3.44
BNDMq4         -  48.4  10.4  4.59  2.57  3.16  3.15  3.17  3.16  3.16
BNDMq6         -     -  24.0  7.22  3.52  3.19  3.19  3.18  3.18  3.20
SBNDMq2     33.5  12.7  6.72  4.25  2.97  2.79  2.79  2.79  2.81  2.82
SBNDMq4        -  45.8  9.90  4.39  2.56  2.46  2.46  2.46  2.45  2.46
SBNDMq6        -     -  27.4  8.03  3.75  3.21  3.22  3.21  3.21  3.21
UFNDMq4     42.3  21.1  10.9  6.04  3.53  3.54  3.53  3.53  3.52  3.54
KBNDM       43.7  22.0  12.1  7.25  4.72  3.10  3.58  3.58  3.59  3.58

In the case of very short patterns the best performance is obtained by the TVSBS
and SSABS algorithms. The FBOM and SFBOM algorithms also show very good
performance. For short patterns the algorithms based on bit-parallelism achieve
good results, in particular BNDMq2, FSBNDM and SBNDM2. However the algorithms
in the EBOM family are also good choices. In the case of long patterns the best
results are obtained by the EBOM, HASH5 and SBNDMq4 algorithms. For very
long patterns the best results are obtained by the SSEF algorithm. Regarding the
overall performance, we notice that the algorithms in the EBOM family and the
TVSBS and FSBNDM algorithms maintain very good performance for all patterns.
2.4 Experimental Results on Rand16 Problem

In this section we present experimental results on a random text buffer over an
alphabet of 16 characters.
m              2     4     8    16    32    64   128   256   512  1024
...
[EBOM]      18.6  7.15  3.88  2.81  2.55  2.44  2.83  1.81  1.42  2.68
FBOM        13.3  8.17  5.10  3.41  2.79  2.66  3.20  1.88  1.45  2.69
SEBOM       19.6  7.57  4.11  2.94  2.68  2.56  2.95  1.93  1.49  2.74
SFBOM       13.3  8.25  5.18  3.48  2.87  2.74  3.28  1.96  1.50  2.73
SO          16.8  16.8  16.8  16.8  16.8  22.1  22.1  22.1  22.1  22.1
SA          16.4  16.4  16.4  16.4  16.4  19.1  19.1  19.1  19.1  19.1
SBNDM       48.1  16.8  7.71  4.20  2.61  2.60  2.59  2.60  2.60  2.60
TNDM        25.0  14.9  9.34  5.32  2.89  2.87  2.88  2.89  2.89  2.88
TNDMa       22.3  13.6  9.26  5.65  2.88  2.83  2.83  2.82  2.82  2.84
LBNDM       34.0  19.0  11.2  6.31  3.57  2.62  3.54  2.47  2.53  4.08
SVM1        18.5  13.9  16.4  11.9  9.20  20.8  20.8  20.8  20.8  20.8
SBNDM2      35.2  12.5  6.10  3.45  2.55  2.44  2.45  2.45  2.45  2.44
SBNDM-BMH   16.1  9.90  6.57  4.17  2.62  2.59  2.59  2.60  2.59  2.60
BMH-SBNDM   15.5  8.42  5.00  3.35  2.75  2.84  2.84  2.85  2.81  2.84
FAOSOq2     36.6  12.4  10.7  10.7  10.2  10.2  10.2  10.2  10.2  10.2
AOSO2       34.1  11.2  9.73  9.72  9.74  8.52  8.52  8.52  8.53  8.52
AOSO4          -  28.5  6.57  5.09  5.10  4.55  4.57  4.55  4.55  4.55
FSBNDM      23.6  12.1  6.46  3.73  2.38  2.38  2.38  2.39  2.39  2.37
BNDMq2      33.3  11.8  5.61  3.16  2.48  2.68  2.67  2.67  2.68  2.70
BNDMq4         -  48.4  10.4  4.57  2.57  3.14  3.14  3.16  3.15  3.14
SBNDMq2     32.6  11.7  5.70  3.35  2.50  2.45  2.44  2.44  2.44  2.44
SBNDMq4        -  45.7  9.88  4.36  2.54  2.44  2.45  2.45  2.44  2.44

In the case of very short patterns the best results are obtained by the TVSBS
and EBOM algorithms for patterns of length 2 and 4, respectively. For short
patterns the EBOM algorithm is the fastest. However it is outperformed by the
FSBNDM algorithm for patterns of length 32. The FSBNDM algorithm is also very
fast for long patterns but is outperformed by the HASH3 algorithm and by
the SSEF algorithm for patterns of length 64 and 256, respectively. For very long
patterns the best results are obtained by the SSEF algorithm. Regarding the
overall performance, the TVSBS algorithm, the BR algorithm and the algorithms
in the EBOM family maintain very good performance for all patterns.
2.5 Experimental Results on Rand32 Problem

In this section we present experimental results on a random text buffer over an
alphabet of 32 characters.



m              2     4     8    16    32    64   128   256   512  1024
BM          21.1  11.2  6.34  3.88  2.79  2.55  2.76  2.74  2.72  2.68
KR          17.8  16.7  16.4  16.4  16.4  16.4  16.4  16.4  16.4  16.4
ZT          29.8  15.4  8.29  4.86  3.08  2.55  2.87  1.67  1.07  0.77
QS          13.5  8.57  5.29  3.43  2.64  2.48  2.82  2.71  2.70  2.70
TunBM       14.9  8.01  4.58  2.92  2.50  2.36  2.57  2.58  2.57  2.58
NSN         15.0  15.7  15.7  15.6  15.6  15.5  15.7  15.7  15.7  15.5
Raita       15.9  8.48  4.84  3.07  2.52  2.40  2.63  2.60  2.61  2.60
RCol        13.8  7.43  4.27  2.76  2.43  2.33  2.53  2.52  2.48  2.47
Skip        23.6  13.4  8.19  5.51  3.80  2.85  4.78  3.44  2.58  2.32
BR          11.8  8.20  5.34  3.50  2.64  2.51  2.85  1.67  1.08  0.79
FS          13.8  7.44  4.27  2.76  2.43  2.33  2.53  2.51  2.49  2.47
FFS         13.8  7.52  4.33  2.79  2.44  2.35  2.64  2.62  2.47  2.53
BFS         13.8  7.47  4.29  2.75  2.44  2.35  2.65  2.61  2.49  2.58
TS          12.8  12.1  11.0  9.22  7.07  4.98  4.35  3.58  3.19  3.10
SSABS       10.6  6.92  4.43  3.06  2.58  2.46  2.66  2.64  2.65  2.66
TVSBS       10.2  7.19  4.73  3.17  2.62  2.50  2.68  1.58  1.02  0.74
FJS         10.5  6.97  4.49  3.09  2.61  2.49  2.68  2.66  2.66  2.66
HASH3          -  18.1  6.68  3.45  2.45  2.30  2.63  1.92  1.55  1.38
HASH5          -     -  12.1  4.72  2.64  2.38  2.73  1.85  1.37  1.22
HASH8          -     -     -  7.59  3.44  2.45  2.85  1.95  1.45  1.29
TSW         13.6  9.51  6.29  4.18  3.05  2.55  3.36  2.06  1.40  1.06
GRASPm      15.4  8.25  4.69  2.95  2.48  2.35  2.61  2.49  2.09  1.61
SSEF           -     -     -     -  5.38  3.38  3.44  1.78  1.00  0.54
RF          22.8  13.0  8.05  5.29  3.36  2.57  3.07  2.42  2.62  4.16
BOM         37.8  27.4  24.6  17.4  11.4  6.97  5.31  2.96  1.73  1.15
BOM2        24.0  13.6  8.47  5.59  3.48  2.51  3.00  1.75  1.27  2.42
EBOM        18.3  6.87  3.63  2.67  2.49  2.41  2.72  1.71  1.38  2.69
FBOM        11.8  7.41  4.61  3.05  2.67  2.61  2.91  1.79  1.46  2.72
SEBOM       19.4  7.29  3.85  2.79  2.61  2.52  2.83  1.79  1.45  2.73
SFBOM       11.8  7.48  4.68  3.14  2.74  2.67  2.98  1.89  1.51  2.75
SO          16.8  16.8  16.8  16.8  16.8  22.0  22.1  22.1  22.1  22.1
SA          16.4  16.4  16.4  16.4  16.4  19.1  19.1  19.1  19.1  19.1
SBNDM       48.1  16.6  7.52  4.00  2.45  2.61  2.61  2.61  2.60  2.61
LBNDM       31.5  16.9  9.64  5.81  3.43  2.46  3.23  1.98  1.58  1.52
SVM1        15.9  11.6  15.4  11.1  8.70  20.7  20.7  20.7  20.7  20.7
SBNDM2      35.0  12.3  5.90  3.27  2.45  2.39  2.40  2.39  2.39  2.39
SBNDM-BMH   13.6  7.81  4.92  3.45  2.52  2.60  2.60  2.60  2.60  2.61
BMH-SBNDM   13.9  7.42  4.24  2.73  2.41  2.44  2.45  2.44  2.44  2.44
FAOSOq2     23.8  11.1  10.7  10.7  10.2  10.2  10.2  10.2  10.2  10.2
FAOSOq4        -  18.1  6.12  5.72  5.46  5.45  5.45  5.45  5.45  5.44
AOSO2       21.9  10.1  9.73  9.73  9.72  8.52  8.53  8.52  8.54  8.53
AOSO4          -  16.9  5.45  5.09  5.08  4.55  4.55  4.55  4.55  4.54
AOSO6          -     -  15.1  3.96  3.57  3.29  3.29  3.29  3.29  3.30
FSBNDM      21.9  11.2  5.97  3.43  2.29  2.28  2.29  2.31  2.29  2.30
BNDMq2      33.1  11.5  5.37  2.94  2.40  2.47  2.48  2.48  2.48  2.47
BNDMq4         -  48.5  10.5  4.59  2.58  3.15  3.15  3.15  3.14  3.15
SBNDMq2     32.4  11.4  5.46  3.12  2.42  2.38  2.37  2.39  2.39  2.39
SBNDMq4        -  45.7  9.90  4.37  2.53  2.44  2.45  2.45  2.44  2.44
UFNDMq2     30.3  15.5  8.19  4.52  2.79  2.79  2.79  2.79  2.80  2.80
DBWW        19.3  11.1  6.90  4.07  4.07  4.06  4.07  4.06  4.07  4.07
DBWW2       19.1  11.1  6.85  4.03  4.03  4.03  4.04  4.04  4.05  4.04
KBNDM       40.2  20.3  10.6  5.82  3.49  2.62  3.01  2.00  2.00  2.02

In the case of very short patterns the TVSBS and the EBOM algorithms
obtain the best results. For short patterns the EBOM algorithm is still the best
algorithm. However it is outperformed by the FSBNDM algorithm for patterns
of length 32. In the case of long patterns the FSBNDM algorithm achieves the
best results when the length of the pattern is less than 256. For patterns of
length 256 the best results are obtained by the TVSBS algorithm. For very long
patterns the best results are obtained by the SSEF algorithm. Regarding the
overall performance, the TVSBS and BR algorithms maintain very good performance
for all patterns.
2.6 Experimental Results on Rand64 Problem



In this section we present experimental results on a random text buffer over an 
alphabet of 64 characters. 



m              2     4     8    16    32    64   128   256   512  1024
BM          20.3  10.7  5.87  3.49  2.52  2.32  2.72  2.53  2.49  2.49
HOR         24.9  13.0  7.05  4.04  2.69  2.43  3.14  2.72  2.68  2.69
KR          17.1  16.5  16.4  16.4  16.4  16.4  16.4  16.4  16.4  16.4
ZT          29.4  15.4  8.42  5.01  3.25  2.61  2.86  1.61  0.97  0.61
QS          12.6  7.92  4.80  3.03  2.45  2.28  2.97  2.59  2.55  2.57
TunBM       13.7  7.29  4.09  2.60  2.34  2.20  2.60  2.42  2.40  2.38
NSN         14.3  14.5  14.5  14.6  14.5  14.5  14.5  14.5  14.4  14.5
Raita       15.4  8.12  4.55  2.77  2.41  2.26  2.66  2.47  2.45  2.44
RCol        13.1  7.00  3.94  2.52  2.32  2.20  2.58  2.41  2.38  2.36
Skip        22.1  12.0  6.92  4.37  3.11  2.57  3.74  2.76  2.14  1.83
BR          11.4  8.02  5.32  3.60  2.78  2.60  2.82  1.59  0.96  0.63
FS          13.1  7.00  3.94  2.52  2.31  2.19  2.58  2.42  2.39  2.37
FFS         13.1  7.03  3.96  2.54  2.34  2.23  2.66  2.54  2.58  2.77
BFS         13.2  7.01  3.95  2.53  2.34  2.23  2.68  2.55  2.59  2.78
TS          12.0  11.6  11.0  9.94  8.37  6.39  5.20  3.87  2.90  2.68
SSABS       9.75  6.38  4.02  2.69  2.43  2.29  2.66  2.48  2.44  2.44
TVSBS       9.83  7.04  4.75  3.30  2.73  2.56  2.67  1.52  0.92  0.59
PBMH        21.0  11.0  6.05  3.58  2.54  2.36  3.19  2.91  3.64  6.68
FJS         9.59  6.30  3.98  2.69  2.44  2.31  2.66  2.48  2.43  2.44
HASH3          -  17.9  6.62  3.40  2.42  2.28  2.58  1.72  1.28  1.13
HASH5          -     -  12.1  4.73  2.63  2.39  2.72  1.84  1.37  1.21
HASH8          -     -     -  7.60  3.44  2.46  2.86  1.96  1.44  1.29
TSW         16.3  11.4  7.42  4.80  3.33  2.68  3.34  1.98  1.26  0.88
GRASPm      14.6  7.72  4.31  2.63  2.36  2.21  2.64  2.41  2.26  2.05
SSEF           -     -     -     -  5.39  3.37  3.43  1.80  0.99  0.55
RF          21.0  11.5  6.70  4.32  3.07  2.56  3.05  2.34  2.51  3.89
BOM         36.2  26.6  25.6  19.0  13.9  9.40  6.85  3.71  2.10  1.36
BOM2        22.2  12.1  7.10  4.54  3.18  2.52  2.99  1.71  1.24  2.40
EBOM        18.4  6.92  3.72  2.77  2.59  2.51  2.76  1.73  1.41  2.68
FBOM        11.5  7.41  4.70  3.23  2.76  2.67  2.86  1.79  1.45  2.71
SEBOM       19.4  7.32  3.94  2.88  2.70  2.62  2.87  1.83  1.48  2.74
SFBOM       11.5  7.47  4.76  3.29  2.83  2.74  2.95  1.86  1.52  2.78
SBDM        24.9  13.0  7.03  4.06  2.69  2.44  3.11  2.72  2.68  2.70
SO          16.8  16.8  16.8  16.8  16.8  22.1  22.1  22.1  22.1  22.1
SA          16.4  16.4  16.4  16.4  16.4  19.1  19.1  19.1  19.1  19.1
[SBNDM]       48  16.5  7.47  3.94  2.42  2.53  2.52  2.52  2.52  2.52
[TNDM]      21.7  11.7  6.75  4.22  2.87  2.84  2.84  2.85  2.84  2.86
TNDMa       19.5  10.4  5.94  3.89  2.92  2.84  2.85  2.84  2.84  2.86
LBNDM       30.4  15.9  8.75  5.16  3.28  2.43  3.00  1.77  1.21  1.06
SVM1        14.7  10.5  14.9  10.5  8.34  20.7  20.7  20.7  20.7  20.7
SBNDM2      34.9  12.3  5.85  3.23  2.42  2.38  2.38  2.38  2.38  2.38
SBNDM-BMH   12.4  6.90  4.08  2.83  2.41  2.52  2.52  2.51  2.52  2.52
BMH-SBNDM   13.2  6.98  3.92  2.50  2.30  2.34  2.35  2.36  2.36  2.35
FAOSOq2     17.3  10.8  10.7  10.7  10.2  10.2  10.2  10.2  10.2  10.2
FAOSOq4        -  12.0  5.82  5.72  5.45  5.45  5.45  5.46  5.45  5.46
AOSO2       15.8  9.80  9.73  9.72  9.72  8.53  8.52  8.52  8.52  8.53
AOSO4          -  11.0  5.18  5.09  5.08  4.54  4.55  4.55  4.55  4.54
FSBNDM      21.2  10.9  5.80  3.32  2.27  2.27  2.27  2.26  2.27  2.27
BNDMq2      33.0  11.4  5.31  2.88  2.39  2.44  2.44  2.43  2.43  2.43
BNDMq4         -  48.6  10.5  4.60  2.57  3.14  3.14  3.16  3.15  3.14
SBNDMq2     32.3  11.4  5.40  3.07  2.39  2.38  2.38  2.38  2.38  2.38
SBNDMq4        -  45.7  9.90  4.37  2.53  2.44  2.44  2.45  2.44  2.44
UFNDMq2     30.2  15.4  8.07  4.43  2.71  2.70  2.70  2.70  2.70  2.70
DBWW        17.4  9.58  5.73  3.66  3.65  3.65  3.65  3.66  3.66  3.65
DBWW2       17.3  9.55  5.73  3.63  3.64  3.63  3.63  3.63  3.63  3.64
KBNDM       40.2  20.5  10.8  5.94  3.54  2.64  3.01  1.63  1.47  1.46

In the case of very short patterns the FJS algorithm obtains the best
performance. For short patterns the algorithms SBNDM-BMH and BMH-SBNDM are
very fast. However, other algorithms based on bit-parallelism also obtain good
results. In particular, the FSBNDM algorithm is the fastest for patterns of
length 32. For patterns of length 8 the EBOM algorithm obtains the best
results. In the case of long patterns the FSBNDM algorithm obtains very good
results, although in some cases it is outperformed by the TVSBS algorithm. For
very long patterns the best results are obtained by the SSEF and TVSBS
algorithms. Regarding the overall performance, the algorithms BR and TVSBS
maintain very good performance for all patterns.

2.7 Experimental Results on Rand128 Problem

In this section we present experimental results on a random text buffer over an
alphabet of 128 characters.
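
A random text of this kind, and a timing loop over patterns of a given length,
can be sketched as follows. This is only an illustration under our own
assumptions: the text length, the number of patterns per length, the timing
unit and the way patterns are chosen are not stated in this section, the
function names are hypothetical, and Python's built-in bytes.find stands in
for the algorithm under test.

```python
import random
import time


def random_text(n, sigma, seed=0):
    """Return n bytes drawn uniformly from an alphabet of sigma symbols (sigma <= 256)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(sigma) for _ in range(n))


def count_occurrences(text, pattern):
    """Count all occurrences of pattern in text, overlapping ones included."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def time_search(text, m, runs=5, seed=1):
    """Average search time in seconds over `runs` patterns of length m cut from text."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        start = rng.randrange(len(text) - m + 1)
        pattern = text[start:start + m]
        t0 = time.perf_counter()
        found = count_occurrences(text, pattern)
        total += time.perf_counter() - t0
        assert found >= 1  # the pattern was cut from the text itself
    return total / runs


if __name__ == "__main__":
    text = random_text(200_000, sigma=128)  # assumed size, for illustration only
    for m in (2, 4, 8, 16, 32):
        print(m, f"{time_search(text, m):.6f}")
```

Cutting the patterns from the text guarantees at least one occurrence per
search; whether the original experiments did the same is an assumption here.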



m               2      4      8     16     32     64    128    256    512   1024
BM           19.9   10.4   5.65   3.32   2.44   2.24   2.68   2.08   2.11   2.08
HOR          24.2   12.6   6.74   3.81   2.53   2.34   3.06   2.30   2.36   2.36
KR           16.7   16.4   16.3   16.4   16.4   16.4   16.4   16.4   18.7   17.4
ZT           39.2   20.3   10.9   6.16   3.83   2.77   2.98   1.62   1.07   0.60
OM           18.5   11.6   6.99   4.30   2.83   2.45   2.83   2.26   2.35   2.22
QS           12.1   7.64   4.57   2.86   2.40   2.21   2.91   2.21   2.28   2.05
TunBM        13.1   6.99   3.85   2.48   2.29   2.14   2.58   2.03   2.06   1.84
NSN          13.8   14.0   13.9   13.9   13.9   13.9   14.0   13.9   15.9   14.4
Raita        15.1   8.01   4.40   2.64   2.38   2.19   2.66   2.09   2.13   1.89
RCol         12.8   6.82   3.80   2.46   2.27   2.13   2.58   2.03   2.04   1.83
Skip         21.4   11.3   6.29   3.79   2.71   2.43   3.28   2.18   1.90   1.37
BR           15.2   10.6   6.87   4.43   3.06   2.74   2.92   1.59   1.08   0.60
FS           12.8   6.82   3.79   2.46   2.29   2.13   2.57   2.02   2.04   1.84
FFS          12.8   6.83   3.81   2.47   2.32   2.17   2.66   2.20   2.44   2.49
BFS          12.8   6.85   3.80   2.47   2.31   2.19   2.66   2.20   2.44   2.50
TS           11.6   11.4   11.1   10.5   9.47   7.96   6.43   5.01   3.67   2.54
SSABS        9.34   6.14   3.81   2.54   2.39   2.25   2.61   2.07   2.08   1.86
TVSBS        13.6   9.60   6.30   4.15   2.95   2.66   2.77   1.54   1.05   0.59
PBMH         20.3   10.6   5.76   3.38   2.47   2.27   3.20   2.54   3.52   6.30
FJS          9.18   6.04   3.76   2.54   2.38   2.25   2.61   2.04   2.10   1.86
HASH3           -   18.0   6.62   3.41   2.44   2.30   2.57   1.72   1.45   1.18
HASH5           -      -   12.1   4.71   2.65   2.38   2.73   1.85   1.57   1.26
HASH8           -      -      -   7.59   3.44   2.46   2.85   1.97   1.65   1.34
TSW          18.8   13.1   8.52   5.57   4.05   3.44   3.67   2.11   1.47   0.88
GRASPm       14.2   7.51   4.13   2.48   2.35   2.15   2.64   2.06   2.06   1.80
SSEF            -      -      -      -   5.40   3.38   3.43   1.79   1.14   0.57
BOM          35.5   26.4   26.6   20.3   16.0   12.2   9.55   5.49   3.45   1.92
BOM2         21.5   11.4   6.43   3.93   2.83   2.54   3.03   1.72   1.41   2.48
ILDM1        19.6   10.4   5.77   3.51   2.75   2.59   3.18   3.02   5.01   7.90
EBOM         26.0   8.61   4.65   3.11   2.69   2.56   2.83   1.75   1.62   2.77
FBOM         16.8   10.7   6.52   4.16   2.97   2.83   2.96   1.83   1.68   2.79
SEBOM        26.4   8.82   4.78   3.23   2.81   2.67   2.93   1.83   1.67   2.79
SFBOM        16.8   10.7   6.60   4.22   3.04   2.87   3.06   1.98   1.77   2.88
SBDM         24.2   12.6   6.75   3.81   2.52   2.33   3.05   2.42   2.37   2.13
SO           16.8   16.9   16.8   16.8   16.8   22.1   22.1   25.3   25.1   22.8
SA           16.4   16.5   16.4   16.4   16.4   19.1   19.1   21.5   21.8   19.7
SBNDM        48.1   16.6   7.46   3.94   2.42   2.36   2.36   2.66   2.70   2.43
TNDM         21.1   11.2   6.27   3.74   2.64   2.62   2.62   2.94   3.01   2.71
TNDMa        19.1   9.95   5.39   3.28   2.58   2.51   2.52   2.81   2.87   2.59
LBNDM        29.8   15.5   8.29   4.73   3.01   2.41   2.91   1.81   1.18   0.81
SVM1         14.1   9.97   14.6   10.2   8.12   20.7   20.7   23.3   23.7   21.4
SBNDM2       34.9   12.4   5.84   3.23   2.43   2.38   2.38   2.66   2.70   2.45
SBNDM-BMH    11.8   6.55   3.71   2.58   2.29   2.36   2.36   2.67   2.69   2.43
BMH-SBNDM    12.8   6.86   3.78   2.45   2.28   2.34   2.33   2.61   2.67   2.40
FNDM         22.5   12.1   6.65   3.93   2.71   2.63   2.63   2.95   3.00   2.72
FAOSOq2      14.0   10.8   10.7   10.7   10.2   10.2   10.2   11.4   11.7   10.5
FAOSOq4         -   8.87   5.76   5.72   5.44   5.45   5.46   6.10   6.20   5.63
AOSO2        12.7   9.74   9.75   9.74   9.74   8.53   8.51   9.58   9.74   8.79
AOSO4           -   8.02   5.12   5.10   5.09   4.55   4.55   5.09   5.18   4.70
FSBNDM       20.9   10.8   5.74   3.29   2.26   2.27   2.25   2.54   2.57   2.33
BNDMq2       33.0   11.4   5.30   2.88   2.38   2.42   2.41   2.72   2.77   2.50
BNDMq4          -   48.7   10.5   4.58   2.58   3.15   3.17   3.52   3.58   3.25
SBNDMq2      32.3   11.4   5.39   3.06   2.39   2.37   2.38   2.65   2.71   2.46
SBNDMq4         -   45.8   9.89   4.37   2.54   2.44   2.44   2.76   2.79   2.52
UFNDMq2      30.2   15.4   8.07   4.40   2.69   2.68   2.67   3.00   3.06   2.77
DBWW         16.5   8.85   5.07   3.17   3.17   3.18   3.17   3.88   3.62   3.29
DBWW2        16.4   8.82   5.05   3.16   3.17   3.17   3.17   3.84   3.59   3.27
KBNDM        44.7   22.7   11.8   6.38   3.64   2.64   3.09   2.07   1.24   1.13



In the case of very short patterns the FJS algorithm has the best
performance. For short patterns the algorithms SBNDM-BMH, BMH-SBNDM and
FSBNDM obtain the best results. In the case of long patterns the algorithm
FSBNDM is still a good choice. Very good results are also obtained by the
TVSBS and FS algorithms. For very long patterns the best results are obtained
by the SSEF and TVSBS algorithms. Regarding the overall performance, the
algorithm FS shows good results for all patterns. Good results are also
maintained over all patterns by the algorithm FJS.
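
The per-class statements above amount to taking, for each pattern length, the
minimum over a column of the table. A minimal sketch of that reading, assuming
the entries are running times (so smaller is better) and using only five rows
copied from the Rand128 table; the names LENGTHS, ROWS and best_per_length are
ours, and the winners it reports are winners among these five rows only:

```python
LENGTHS = [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

# Five rows of the Rand128 table; None marks a cell with no entry.
ROWS = {
    "FJS":       [9.18, 6.04, 3.76, 2.54, 2.38, 2.25, 2.61, 2.04, 2.10, 1.86],
    "SBNDM-BMH": [11.8, 6.55, 3.71, 2.58, 2.29, 2.36, 2.36, 2.67, 2.69, 2.43],
    "FSBNDM":    [20.9, 10.8, 5.74, 3.29, 2.26, 2.27, 2.25, 2.54, 2.57, 2.33],
    "TVSBS":     [13.6, 9.60, 6.30, 4.15, 2.95, 2.66, 2.77, 1.54, 1.05, 0.59],
    "SSEF":      [None, None, None, None, 5.40, 3.38, 3.43, 1.79, 1.14, 0.57],
}


def best_per_length(rows, lengths=LENGTHS):
    """Map each pattern length to the row name with the smallest entry in that column."""
    best = {}
    for col, m in enumerate(lengths):
        candidates = [(values[col], name)
                      for name, values in rows.items()
                      if values[col] is not None]
        best[m] = min(candidates)[1]
    return best


if __name__ == "__main__":
    for m, name in best_per_length(ROWS).items():
        print(m, name)
```

Restricting the comparison to a handful of rows keeps the example short; the
same function applies unchanged to the full table.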

2.8 Experimental Results on Rand256 Problem

In this section we present experimental results on a random text buffer over an
alphabet of 256 characters.



m               2      4      8     16     32     64    128    256    512   1024
BM           20.3   10.6   5.70   3.34   2.49   2.26   2.67   1.75   1.33   1.21
HOR          24.7   12.8   6.83   3.83   2.54   2.39   2.98   1.98   1.54   1.40
ZT           47.7   24.7   13.3   7.84   5.44   4.32   3.87   2.02   1.08   0.63
OM           18.8   11.8   7.05   4.30   2.81   2.48   2.85   1.94   1.52   1.49
QS           12.3   7.71   4.60   2.86   2.46   2.26   2.83   1.88   1.47   1.35
TunBM        13.2   6.98   3.87   2.51   2.34   2.17   2.60   1.73   1.30   1.16
NSN          14.1   14.1   14.1   14.1   14.0   14.1   14.1   14.1   14.1   14.1
Raita        15.5   8.13   4.49   2.67   2.44   2.23   2.69   1.79   1.35   1.20
RCol         13.1   6.91   3.84   2.53   2.32   2.18   2.58   1.73   1.31   1.18
Skip         21.7   11.3   6.20   3.63   2.60   2.38   3.10   1.87   1.32   1.02
BR           19.9   13.8   9.06   6.15   4.75   4.21   3.78   2.00   1.10   0.60
FS           13.1   6.90   3.84   2.50   2.35   2.17   2.59   1.72   1.29   1.18
FFS          13.1   6.92   3.84   2.54   2.36   2.23   2.68   1.91   1.69   1.91
BFS          13.1   6.90   3.84   2.52   2.35   2.23   2.69   1.94   1.69   1.92
TS           11.7   11.6   11.5   11.1   10.5   9.48   8.08   6.68   4.51   2.90
SSABS        9.47   6.19   3.85   2.57   2.43   2.29   2.61   1.75   1.31   1.18
TVSBS        18.4   12.8   8.46   5.88   4.64   4.15   3.70   1.98   1.08   0.60
PBMH         20.7   10.8   5.80   3.40   2.52   2.31   3.13   2.24   2.58   5.59
FJS          9.26   6.07   3.76   2.55   2.44   2.29   2.62   1.74   1.32   1.18
HASH3           -   18.5   6.84   3.50   2.50   2.35   2.65   1.77   1.31   1.17
HASH5           -      -   12.4   4.86   2.72   2.47   2.81   1.91   1.42   1.25
HASH8           -      -      -   7.83   3.56   2.53   2.94   2.03   1.47   1.33
TSW          29.7   22.2   15.8   11.1   7.72   5.47   4.65   2.56   1.49   0.92
GRASPm       14.5   7.60   4.18   2.50   2.38   2.19   2.64   1.75   1.32   1.16
SSEF            -      -      -      -   5.58   3.49   3.55   1.84   1.03   0.55
BOM2         21.8   11.5   6.32   3.77   2.76   2.61   3.11   1.85   1.31   2.41
ILDM1        19.9   10.4   5.65   3.35   2.73   2.66   3.22   3.16   4.65   7.83
EBOM         30.0   10.4   5.48   3.72   3.13   2.99   3.06   1.90   1.52   2.77
SBDM         24.7   12.8   6.81   3.82   2.54   2.39   3.00   1.96   1.53   1.37
SA           16.9   16.9   17.0   17.0   17.0   19.7   19.7   19.7   19.7   19.7
SBNDM        49.6   17.0   7.69   4.05   2.49   2.36   2.36   2.35   2.35   2.36
TNDM         21.6   11.3   6.17   3.62   2.59   2.59   2.59   2.59   2.58   2.58
TNDMa        19.5  10.00   5.33   3.07   2.52   2.44   2.40   2.43   2.44   2.44
LBNDM        30.4   15.7   8.33   4.66   2.87   2.46   2.91   1.61   0.96   0.66
SVM1         14.2   9.98   13.6   10.4   8.26   21.4   21.4   21.4   21.4   21.4
SBNDM2       36.0   12.6   6.03   3.34   2.49   2.45   2.45   2.45   2.45   2.45
SBNDM-BMH    11.9   6.50   3.63   2.55   2.30   2.35   2.35   2.34   2.36   2.36
BMH-SBNDM    13.1   6.90   3.83   2.51   2.32   2.39   2.38   2.38   2.39   2.40
FNDM         22.9   12.0   6.57   3.76   2.65   2.59   2.60   2.60   2.61   2.60
FAOSOq2      12.8   11.1   11.1   11.1   10.5   10.5   10.5   10.5   10.5   10.5
FAOSOq4         -   7.57   5.92   5.90   5.62   5.64   5.62   5.62   5.62   5.62
AOSO2        11.6   10.0   10.1   10.0   10.1   8.80   8.79   8.80   8.79   8.80
AOSO4           -   6.81   5.27   5.25   5.26   4.69   4.69   4.70   4.70   4.70
AOSO6           -      -   5.20   3.70   3.69   3.40   3.41   3.40   3.40   3.38
FSBNDM       21.4   11.1   5.89   3.37   2.32   2.32   2.33   2.33   2.33   2.34
BNDMq2       34.0   11.8   5.45   2.98   2.45   2.49   2.48   2.49   2.49   2.49
SBNDMq2      33.3   11.7   5.57   3.17   2.46   2.45   2.46   2.47   2.45   2.45
SBNDMq4         -   47.2   10.2   4.51   2.60   2.52   2.51   2.51   2.52   2.51
DBWW         16.6   8.77   4.88   2.96   2.96   2.97   2.95   2.96   2.97   2.97
DBWW2        16.6   8.78   4.90   2.95   2.96   2.96   2.96   2.95   2.96   2.96
KBNDM        46.6   23.7   12.5   7.16   4.75   3.61   3.42   1.78   0.96   0.87



In the case of very short patterns the best performance is obtained by the
FJS algorithm. For short patterns the algorithms SBNDM-BMH, BMH-SBNDM
and FS obtain the best results. In the case of long patterns the algorithms
based on character comparison are good choices, among them FS, GRASPm
and TunBM. Very good results are also obtained by the LBNDM, KBNDM and
FSBNDM algorithms. For very long patterns the best results are obtained by the
SSEF, LBNDM and KBNDM algorithms. Regarding the overall performance, the
algorithm FS shows good results for all patterns. Good results are also
maintained over all patterns by the algorithms GRASPm and FJS.

2.9 Experimental Results on a Genome Sequence



In this section we present experimental results on a genomic sequence which 
consists of 4 different characters. 



m               2      4      8     16     32     64    128    256    512   1024
KR           51.1   35.8   33.8   33.7   33.7   33.8   33.8   33.8   33.8   33.8
QS           58.4   44.6   34.7   31.7   31.1   31.2   31.3   31.7   31.4   31.1
TunBM        66.5   44.0   35.0   32.3   31.7   32.7   32.6   32.6   32.1   31.5
NSN             ?      ?      ?      ?      ?   61.2      ?      ?      ?      ?
Raita        60.6   37.4   28.4   26.2   25.7   26.6   26.2   25.8   26.1   26.0
RCol         57.4   39.5   28.6   23.7   20.7   17.7   15.6   14.4   12.6   11.3
ASkip         112   65.5   31.7   16.3   9.67   7.07   8.41   6.59   6.61   9.32
BR           50.5   37.2   25.2   17.0   12.8   11.3   11.3   11.4   11.4   11.4
FS           57.4   39.7   28.9   23.5   20.4   17.7   15.8   14.5   12.9   11.5
FFS          56.1   37.9   27.1   19.9   14.9   11.8   10.4   9.06   7.47   6.70
BFS          57.2   38.8   26.9   20.1   15.7   12.6   11.2   9.89   8.43   7.62
TS           56.0   45.9   33.1   23.7   18.5   15.5   13.7   12.4   10.9   10.2
SSABS        51.4   42.5   36.2   34.2   33.6   33.7   33.8   33.9   33.6   34.4
TVSBS        44.6   35.3   24.6   17.5   13.8   12.7   12.6   12.6   12.8   12.5
FJS          60.3   50.6   43.7   41.8   42.2   42.0   41.7   42.4   41.7   41.3
HASH3           -   39.8   15.0   7.93   5.38   4.96   5.59   5.22   5.06   5.03
HASH5           -      -   25.1   9.84   5.54   4.91   5.66   4.00   3.04   2.58
HASH8           -      -      -   15.8   7.18   5.09   5.86   4.19   3.13   2.70
TSW          55.0      -   27.6   19.0   14.6   12.8   12.9   12.7   12.6   12.9
GRASPm          ?      -      ?   23.1      ?   14.2      ?      ?      ?      ?
SSEF            -      -      -      -   11.5   6.22   6.68   3.75   2.26   1.59
AUT          44.8   46.0   44.8   46.1   46.0   44.8   46.1   45.0   45.7   47.1
RF            102   62.5   35.5   20.1   11.8   7.53   8.09   5.22   4.79   6.02
TRF           112   70.2   41.2   24.8   15.1   9.96   9.64   7.42   7.39   8.88
BOM           136   90.2   56.5   36.6   23.0   14.1   11.8   6.52   3.59   2.05
BOM2          117   66.7   37.7   21.2   12.0   7.20   7.84   4.37   2.73   3.20
WW            108   67.9   40.3   23.5   13.9   8.71   9.43   6.54   6.69   9.86
ILDM1        80.4   53.9   33.3   19.9   11.8   7.73   8.33   6.10   6.65   9.94
ILDM2        89.7   50.5   28.5   16.5   9.90   6.74   7.95   5.88   6.51   9.81
EBOM         49.8   29.1   21.3   14.3   9.03   6.11   7.03   4.08   2.74   3.42
FBOM         60.2   38.0   24.8   16.0   10.2   6.78   7.59   4.35   2.84   3.49
SEBOM        51.4   29.4   21.4   14.5   9.14   6.32   7.24   4.19   2.79   3.47
SFBOM        57.5   37.5   24.1   15.4   9.68   6.30   7.22   4.18   2.80   3.47
SO           35.3   35.3   35.3   35.3   35.3   44.8   44.8   44.8   44.8   44.9
SA              ?      ?      ?      ?      ?      ?      ?      ?      ?      ?
BNDM          102   57.1   30.5   16.5   9.12   12.1   12.1   12.0   12.1   12.1
BNDM-L        101   57.2   30.6   16.4   9.13   15.6   19.9   18.1   17.8   18.9
SBNDM         102   46.5   26.2   15.0   8.44   8.03   8.02   8.04   8.03   8.02
TNDM         82.3   52.8   29.7   16.6   9.54   9.32   9.32   9.29   9.30   9.31
TNDMa        79.4   52.7   30.1   16.9   9.73   9.47   9.48   9.44   9.46   9.43
LBNDM         108   62.5   34.2   19.1   10.8   8.18   12.4   26.4    113    111
SBNDM2       81.5   37.6   22.3   13.0   7.75   7.73   7.73   7.74   7.74   7.73
SBNDM-BMH    67.4   43.5   26.2   14.9   8.44   8.04   8.01   8.04   8.03   8.02
BMH-SBNDM    61.3   40.4   26.2   18.2   11.9   11.9   11.8   11.8   11.8   11.8
FAOSOq2       199   78.2   26.1   22.4   20.5   20.5   20.5   20.5   20.5   20.5
FAOSOq4         -    163   64.7   15.5   11.0   11.0   11.0   11.0   11.0   11.0
AOSO2         211   73.0   23.4   20.0   20.0   17.6   17.6   17.6   17.6   17.6
AOSO4           -    174   61.1   13.8   10.4   9.34   9.34   9.34   9.33   9.33
AOSO6           -      -    155   57.5   8.15   7.62   7.61   7.62   7.60   7.61
FSBNDM       80.9   43.0   23.2   12.9   7.06   7.07   7.05   7.07   7.06   7.07
BNDMq2       77.6   37.6   22.5   13.0   7.63   9.49   9.50   9.48   9.47   9.48
BNDMq4          -    101   22.4   10.2   5.95   7.34   7.34   7.35   7.35   7.33
BNDMq6          -      -   49.8   14.9   7.28   6.59   6.58   6.58   6.58   6.58
SBNDMq2      76.5   36.5   22.0   12.9   7.63   7.73   7.73   7.72   7.72   7.72
SBNDMq4         -   94.9   21.2   9.76   5.93   5.55   5.57   5.56   5.58   5.58
SBNDMq6         -      -   56.7   16.6   7.73   6.64   6.65   6.66   6.64   6.64
SBNDMq8         -      -    201   23.6   9.37   8.64   8.64   8.63   8.64   8.65
UFNDMq4      93.5   45.1   24.1   13.6   8.47   8.45   8.45   8.48   8.48   8.46
UFNDMq6       109   58.2   29.4   15.4   10.0   10.1   10.0   10.0   10.0   10.1
KBNDM         109   58.1   35.5   21.6   12.9   8.06   8.02   8.00   8.02   7.99



In the case of very short patterns the SA and EBOM algorithms obtain the
best performance for patterns of length 2 and 4, respectively. In the case of
short patterns the HASH3 algorithm achieves the best results. In the case of
long patterns the algorithms in the HASHq family are still very good choices.
They are sporadically outperformed by algorithms based on bit-parallelism. For
very long patterns the best results are obtained by the SSEF algorithm.
Regarding the overall performance, the EBOM family of algorithms maintains
good performance for all patterns.

2.10 Experimental Results on a Protein Sequence



In this section we present experimental results on a protein sequence which 
consists of 20 different characters. 



m               2      4      8     16     32     64    128    256    512   1024
KR           12.0   10.9   10.6   10.6   10.6   10.7   10.6   10.7   10.6   10.7
ZT           20.5   10.6   5.69   3.35   2.14   1.76   2.02   1.34   0.99   0.84
QS           9.88   6.32   4.00   2.71   2.08   1.92   2.06   1.97   1.96   1.94
TunBM        11.1   6.17   3.70   2.44   1.97   1.83   1.90   1.89   1.88   1.87
NSN             ?   11.4      ?      ?   11.4      ?      ?   11.4   11.4   11.2
Raita        11.1   5.97   3.56   2.38   1.88   1.76   1.88   1.87   1.84   1.82
RCol         9.86   5.46   3.25   2.19   1.82   1.70   1.81   1.75   1.72   1.68
Skip         17.0   10.4   6.73   4.59   3.21   2.49   3.91   2.79   2.16   1.93
BR           8.61   5.98   3.90   2.52   1.85   1.74   2.05   1.36   1.02   0.87
FS           9.88   5.49   3.26   2.17   1.82   1.70   1.80   1.77   1.72   1.69
FFS          9.84   5.52   3.30   2.18   1.79   1.69   1.88   1.84   1.77   1.98
BFS          9.94   5.48   3.25   2.15   1.78   1.68   1.87   1.85   1.80   2.01
TS           9.41   8.58   7.31   5.80   4.42   3.36   3.03   2.72   2.43   2.20
SSABS        8.01   5.28   3.57   2.48   2.02   1.87   1.97   1.93   1.93   1.93
TVSBS        7.50   5.28   3.50   2.34   1.85   1.73   1.92   1.31   0.98   0.82
FJS          8.21   5.52   3.70   2.65   2.13   1.96   2.03   1.99   1.98   1.98
HASH3           -   11.8   4.39   2.29   1.61   1.52   1.74   1.37   1.17   1.07
HASH5           -      -   7.85   3.07   1.74   1.56   1.79   1.22   0.90   0.80
HASH8           -      -      -   4.96   2.24   1.60   1.87   1.29   0.96   0.85
TSW          9.82   6.86   4.55   3.01   2.18   1.84   2.48   1.71   1.33   1.16
GRASPm       11.2   6.10   3.60   2.35   1.86   1.72   1.90   1.70   1.33   0.99
SSEF            -      -      -      -   3.57   2.24   2.29   1.21   0.63   0.36
RF           16.8   10.2   6.47   3.93   2.31   1.78   2.17   1.91   2.39   4.19
BOM          26.5   18.4   14.3   9.93   6.36   3.92   3.22   1.84   1.06   0.73
BOM2         17.9   10.8   6.84   4.12   2.38   1.74   2.12   1.29   1.02   2.22
EBOM         12.1   4.71   2.59   1.91   1.75   1.70   1.95   1.34   1.20   2.57
FBOM         8.64   5.36   3.40   2.29   1.92   1.85   2.21   1.41   1.24   2.59
SEBOM        12.8   5.01   2.77   2.03   1.89   1.81   2.07   1.46   1.28   2.61
SFBOM        8.68   5.42   3.44   2.36   1.99   1.93   2.27   1.48   1.29   2.63
SO              ?      ?      ?      ?   10.8   14.2   14.2   14.2   14.2   14.2
SA           10.7   10.7   10.7   10.7   10.7   12.3   12.3   12.3   12.3   12.3
SBNDM        31.2   10.9   5.02   2.73   1.71   1.72   1.73   1.72   1.73   1.72
TNDM         16.1   9.55   5.93   3.49   1.95   1.94   1.93   1.94   1.94   1.95
TNDMa        14.4   8.67   5.79   3.66   2.00   1.95   1.95   1.96   1.94   1.95
LBNDM        22.0   12.2   7.08   4.08   2.34   1.70   2.28   1.53   1.40   1.49
SVM1         11.9   8.87   10.2   7.67   5.93   13.5   13.5   13.5   13.5   13.5
SBNDM2       22.8   8.14   3.95   2.25   1.67   1.60   1.60   1.59   1.60   1.60
SBNDM-BMH    10.4   6.30   4.17   2.66   1.73   1.72   1.72   1.73   1.72   1.73
BMH-SBNDM    10.1   5.43   3.18   2.11   1.72   1.77   1.78   1.80   1.78   1.78
FAOSOq2      23.1   8.04   7.02   7.01   6.64   6.66   6.65   6.65   6.65   6.64
AOSO2           -   7.24   6.32   6.32   6.32   5.56   5.57   5.55   5.56   5.57
AOSO4           -      -   4.23   3.31   3.31   2.97   2.97   2.95   2.96   2.97
AOSO6        2.33   2.32   16.1   3.29   2.33   2.15   2.15   2.16   2.15   2.16
FSBNDM       15.3   7.77   4.15   2.42   1.56   1.56   1.55   1.54   1.56   1.56
BNDMq2       21.4   7.61   3.63   2.05   1.62   1.74   1.73   1.75   1.73   1.73
BNDMq4          -   31.5   6.80   2.99   1.68   2.06   2.05   2.06   2.07   2.06
SBNDMq2      21.1   7.58   3.70   2.16   1.63   1.59   1.60   1.59   1.61   1.60
SBNDMq4         -   29.7   6.44   2.85   1.66   1.61   1.60   1.60   1.60   1.61
UFNDMq2      19.9   10.4   5.59   3.18   2.01   2.01   2.01   2.02   2.02   2.02



In the case of very short patterns the SBNDM-BMH algorithm obtains the
best performance. Other very good algorithms are SSABS, TVSBS and FJS. For
short patterns the algorithms based on bit-parallelism achieve better results,
in particular SBNDM2, FSBNDM, BNDMq2 and SBNDMq2. The EBOM and SEBOM
algorithms also maintain good performance. In the case of long patterns the
algorithms EBOM and SEBOM are good choices. Very good results are also
obtained by the HASHq algorithms, LBNDM and TVSBS. For very long patterns the
best results are obtained by the SSEF, ZT, BR, HASHq and TVSBS algorithms.
Among the algorithms based on automata the best results are obtained by the
BOM algorithm. Evaluating the overall performance, the algorithms GRASPm,
BR and TVSBS maintain good performance for all patterns.

2.11 Experimental Results on Bible



In this section we present experimental results on a natural language text with 
63 different characters. 



m               2      4      8     16     32     64    128    256    512   1024
BM           19.0   10.8   6.39   4.17   2.95   2.46   2.59   2.29   1.90   1.59
HOR          23.4   13.1   7.87   4.87   3.45   2.75   3.05   2.58   2.20   1.91
KR           14.6   13.3   13.1   13.1   13.1   13.1   13.1   13.1   13.1   13.1
ZT           26.4   13.8   7.52   4.44   2.87   2.26   2.55   1.77   1.27   0.98
QS           13.2   8.90   5.56   3.66   2.73   2.32   2.75   2.30   1.97   1.70
TunBM        14.8   8.41   5.11   3.35   2.59   2.30   2.55   2.25   1.95   1.65
NSN          14.5   15.8   15.2   15.7   15.5   15.8   16.1   15.9   16.4   16.0
Smith        24.9   15.5   9.29   5.79   3.97   3.03   3.38   2.81   2.39   2.08
Raita        14.4   7.90   4.79   3.18   2.43   2.18   2.46   2.18   1.86   1.59
RCol         12.9   7.39   4.59   3.04   2.38   2.13   2.36   2.07   1.75   1.47
BR           11.3   8.17   5.28   3.41   2.40   2.17   2.63   1.78   1.30   1.01
FS           12.8   7.46   4.58   3.09   2.36   2.13   2.37   2.09   1.76   1.47
FFS          13.0   7.57   4.60   3.02   2.33   2.12   2.39   2.13   1.92   1.96
BFS          13.1   7.62   4.52   3.01   2.29   2.08   2.40   2.14   1.94   1.99
TS           12.7   11.2   8.97   7.40   6.22   5.68   5.51   5.70   5.68   5.68
SSABS        10.8   7.17   4.88   3.43   2.66   2.33   2.52   2.25   1.91   1.65
TVSBS        10.1   7.16   4.60   3.12   2.35   2.15   2.42   1.71   1.21   0.94
FJS          11.1   7.53   5.05   3.53   2.77   2.45   2.62   2.32   1.98   1.69
HASH3           -   14.6   5.42   2.79   1.97   1.84   2.09   1.45   1.06   0.85
HASH5           -      -   9.68   3.81   2.14   1.90   2.21   1.49   1.07   0.89
HASH8           -      -      -   6.08   2.77   1.97   2.28   1.58   1.15   0.98
TSW          13.1   9.32   6.21   4.12   2.91   2.33   3.14   2.17   1.63   1.30
GRASPm       14.7   8.31   4.97   3.31   2.46   2.16   2.52   2.12   1.69   1.32
SSEF            -      -      -      -   4.36   2.59   2.72   1.44   0.80   0.45
BOM          34.6   25.2   19.9   13.9   9.36   6.03   4.90   2.90   1.72   1.10
BOM2         23.6   15.4   9.71   5.83   3.51   2.40   2.83   1.70   1.22   2.35
EBOM         15.3   6.53   3.87   2.91   2.47   2.21   2.55   1.68   1.41   2.67
FBOM         11.8   7.51   4.89   3.51   2.80   2.44   2.85   1.77   1.44   2.69
SEBOM        16.2   6.84   4.10   3.07   2.61   2.33   2.67   1.80   1.48   2.71
SFBOM        11.6   7.46   4.94   3.59   2.83   2.49   2.93   1.84   1.48   2.73
SBDM         23.5   13.4   7.74   4.89   3.44   2.77   3.06   2.58   2.21   1.91
SO           13.3   13.3   13.3   13.3   13.3   17.6   17.6   17.6   17.6   17.6
SA           13.1   13.1   13.1   13.1   13.1   15.3   15.3   15.3   15.3   15.3
SBNDM        38.3   14.0   6.77   3.94   2.55   2.58   2.57   2.56   2.56   2.56
LBNDM        28.3   16.4   9.59   5.57   3.32   2.19   2.90   1.94   1.65   1.48
SVM1         16.1   12.3   13.2   9.77   7.46   16.9   16.9   16.9   16.9   16.9
SBNDM2       28.4   10.5   5.36   3.30   2.39   2.26   2.23   2.25   2.23   2.24
SBNDM-BMH    13.8   8.80   6.02   3.84   2.54   2.54   2.56   2.57   2.58   2.57
BMH-SBNDM    13.2   7.41   4.46   2.94   2.28   2.31   2.31   2.31   2.30   2.29
AOSO2        30.8   10.2   7.86   7.76   7.76   6.82   6.81   6.82   6.81   6.81
AOSO4           -   26.5   5.93   4.09   4.07   3.62   3.62   3.62   3.63   3.64
FSBNDM       19.8   10.3   5.74   3.56   2.20   2.18   2.20   2.18   2.18   2.19
BNDMq2       26.8   10.1   5.06   3.15   2.30   2.67   2.70   2.71   2.70   2.69
BNDMq4          -   38.8   8.48   3.79   2.16   2.66   2.67   2.68   2.67   2.67
BNDMq6          -      -   19.2   5.79   2.83   2.58   2.59   2.57   2.58   2.58
SBNDMq2      26.3   9.87   5.14   3.21   2.37   2.24   2.23   2.21   2.24   2.25
SBNDMq4         -   36.5   8.02   3.61   2.14   2.04   2.05   2.05   2.05   2.05
SBNDMq6         -      -   21.9   6.45   3.02   2.59   2.58   2.59   2.59   2.59



In the case of very short patterns the best results are obtained by the TVSBS
and EBOM algorithms. For short patterns the EBOM algorithm obtains the best
results for patterns of length 8, while in the other cases the best results are
obtained by the HASHq algorithms. In the case of long patterns the algorithms
in the HASHq family are good choices. Very good results are also obtained by
the SSEF and SBNDMq4 algorithms. For very long patterns the best results
are obtained by the SSEF algorithm. Evaluating the overall performance, the
algorithm TVSBS maintains good performance for all patterns.

2.12 Experimental Results on World192



In this section we present experimental results on a natural language text with
94 different characters.



m               2      4      8     16     32     64    128    256    512   1024
BM           11.1   6.08   3.57   2.22   1.56   1.31   1.50   1.21   0.93   0.77
HOR          13.7   7.41   4.26   2.59   1.76   1.44   1.72   1.33   1.08   0.89
KR           8.59   8.13   8.01   7.99   7.98   8.00   8.01   8.02   8.02   8.01
ZT           15.6   8.13   4.46   2.68   1.77   1.44   1.60   1.08   0.76   0.56
OM           10.9   7.00   4.50   2.98   2.06   1.59   1.68   1.41   1.17

In the case of very short patterns the TVSBS and the EBOM algorithms
obtain the best performance. In particular, the TVSBS algorithm is the fastest
for patterns of length 2 while the EBOM algorithm obtains the best results for
patterns of length 8. For short patterns the algorithms EBOM, SBNDM-BMH and
HASHq obtain good performance for patterns of length 8, 16 and 32, respectively.
In the case of long patterns the HASHq algorithms are the best algorithms.
However, they are sporadically outperformed by the FSBNDM algorithm. For very
long patterns the best results are obtained by the SSEF algorithm. Evaluating
the overall performance, the algorithms TVSBS, SSABS and FS maintain good
performance for all patterns.

Fig. 5. Experimental map of the best results obtained in our evaluation.
Comparison based algorithms are presented in red gradations, automata based
algorithms are presented in green gradations and bit parallel algorithms are
presented in blue gradations.



3 Overall Discussion 

We compared 85 exact string matching algorithms on 12 texts of different
types. We divide the patterns into four classes according to their length m:
very short (m ≤ 4), short (4 < m ≤ 32), long (32 < m ≤ 256) and very long
(m > 256). We proceed in the same way for the alphabets according to their
size σ: very small (σ ≤ 4), small (4 < σ ≤ 32), large (32 < σ ≤ 128) and very
large (σ > 128). According to our experimental results, we conclude that the
following algorithms are the most efficient in the following situations
(see Fig. 5):

— SA: very short patterns and very small alphabets. 

— TVSBS: very short patterns and small alphabets, and long patterns and large 
alphabets. 

— FJS: very short patterns and large and very large alphabets. 

— EBOM: short patterns and large and very large alphabets. 

— SBNDM-BMH and BMH-SBNDM: short patterns and very large alphabets. 

— HASHq: short and long patterns and small alphabets.

— FSBNDM: long patterns and large and very large alphabets.

— SBNDMq: long patterns and small alphabets.

— LBNDM: very long patterns and very large alphabets. 

— SSEF: very long patterns. 

Among these algorithms all but one (the SA algorithm) have been designed
during the last decade, four of them are based on comparison of characters,
one of them is based on automata (the EBOM algorithm) while six of them are
bit-parallel algorithms.
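
The list of recommendations above can be read as a lookup from the pair
(m, σ) to a set of candidate algorithms. The following is a minimal sketch of
that lookup, not part of the evaluated software. It assumes that the boundary
values m = 4, 32, 256 and σ = 4, 32, 128 belong to the lower class, which
matches the per-text discussions (lengths 2 and 4 are treated as very short,
lengths 8, 16 and 32 as short). All function and variable names are ours.

```python
def pattern_class(m):
    """Classify a pattern length m (boundaries assigned to the lower class)."""
    if m <= 4:
        return "very short"
    if m <= 32:
        return "short"
    if m <= 256:
        return "long"
    return "very long"


def alphabet_class(sigma):
    """Classify an alphabet size sigma (boundaries assigned to the lower class)."""
    if sigma <= 4:
        return "very small"
    if sigma <= 32:
        return "small"
    if sigma <= 128:
        return "large"
    return "very large"


ALPHABETS = ("very small", "small", "large", "very large")

# (algorithm, set of (pattern class, alphabet class) pairs), in the order of the list.
RECOMMENDATIONS = [
    ("SA", {("very short", "very small")}),
    ("TVSBS", {("very short", "small"), ("long", "large")}),
    ("FJS", {("very short", "large"), ("very short", "very large")}),
    ("EBOM", {("short", "large"), ("short", "very large")}),
    ("SBNDM-BMH", {("short", "very large")}),
    ("BMH-SBNDM", {("short", "very large")}),
    ("HASHq", {("short", "small"), ("long", "small")}),
    ("FSBNDM", {("long", "large"), ("long", "very large")}),
    ("SBNDMq", {("long", "small")}),
    ("LBNDM", {("very long", "very large")}),
    ("SSEF", {("very long", a) for a in ALPHABETS}),
]


def recommend(m, sigma):
    """Return the listed algorithms whose situation covers (m, sigma); may be empty."""
    key = (pattern_class(m), alphabet_class(sigma))
    return [name for name, cases in RECOMMENDATIONS if key in cases]


if __name__ == "__main__":
    print(recommend(2, 4))      # very short pattern, very small alphabet
    print(recommend(64, 20))    # long pattern, small alphabet
    print(recommend(1024, 256)) # very long pattern, very large alphabet
```

Some combinations, such as short patterns over very small alphabets, are not
covered by the list, so the function returns an empty result for them rather
than guessing.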